NETWORKING SYSTEMS



The present invention relates to networking systems and the forwarding and routing of information therein,
being more particularly directed to the problems of a common method for managing both cell and packet or frame
switching in the same device, having common hardware, common QoS (Quality of Service) algorithms, common
forwarding algorithms; building a switch that handles frame switching without interfering with cell switching.

Background of Invention 

Two architectures driving networking solutions are cell switching and frame forwarding. Cell switching
involves the transmission of data in fixed size units called cells. This is based on technology referred to as
Asynchronous Transfer Mode (ATM). Frame forwarding transmits data in arbitrary size units referred to either as 
frames or packets. The basis of frame forwarding is used by a variety of protocols, the most noteworthy being the 
Internet Protocol (IP) suite. 

The present invention is concerned with forwarding cells and frames in a common system utilising common
forwarding algorithms. In co-pending U.S. patent application Serial No. 581,467, filed December 29, 1995, for High
Performance Universal Multi-Ported Internally Cached Dynamic Random Access Memory System, Architecture and
Method, and co-pending U.S. patent application Serial No. 900,757, filed July 25, 1997, for System Architecture for
and Method of Dual Path Data Processing and Management of Packets and/or Cells and the Like, both of common
assignee herewith, a promising solution of common cell/frame forwarding is provided.

Most traditional Internet-style host-to-host data communication is carried out in variable size packet format,
interconnected by networks (defined as a collection of switches) using packet switches called routers. Recently,
ATM has become widely available as a technology to move data between hosts, having been developed to provide a
common method for sending traditional telephony data as well as data for computer-to-computer communication.
The previous method employed was to apply Time Division Multiplexing (TDM) to telephony data, with each circuit
allocated a fixed amount of time on a channel. For example, circuit A may be allocated x amount of time (and thus
data), followed by y and z and then x again, as later described in connection with hereinafter discussed Fig. 3. Thus
each circuit is completely synchronous. This method, however, has intrinsic limitations in bandwidth utilization,
since if a circuit has nothing to send, its allocated bandwidth is not used on the wire. ATM addresses this bandwidth
issue by allowing the circuits to be asynchronous. Though bandwidth is still divided among fixed length data items,
any circuit can transmit at any point in time.

The ITU-T (International Telecommunications Union - Telecommunications, formerly the CCITT) is an
organization chartered by the United Nations to provide telecommunications standards, and has defined four Classes of
service: 1) Constant Bit Rate for Circuit Emulation, i.e. constant-rate voice and video; 2) Variable Bit Rate for
certain voice and video applications; 3) Data for Connection-Oriented Traffic; and 4) Data for Connectionless-
Oriented Traffic. These services, in turn, are supported by certain classes of ATM Adaptation
Layers (AAL). These adaptation layers are defined in ITU-T Recommendation I.363. There are 3 defined
types: AAL1, AAL3/4 and AAL5. AAL2 has never been defined in the ITU-T recommendations and AAL3 and
AAL4 were combined into one type. With respect to the ATM cell make-up, there is no way to distinguish cells that
belong to one layer as opposed to cells that belong to another layer.

The adaptation layer is determined during circuit setup; i.e. when a host computer communicates to the
network. AAL1 has been defined to be used for real-time applications such as voice or video; while AAL5 has been defined
for use by traditional datagram oriented services such as forwarding IP datagrams. A series of AAL5 cells is defined
to make up a packet. The definition of an AAL5 packet consists of a stream of cells with the PTI bit set to
0, except for the last one (as later illustrated in Fig. 1). This is referred to as a segmented packet.

Thus, in current networking technology data is transported in either variable size packets or fixed size cells
depending on the types of switching devices installed in the network. Routers can be connected to each other directly
or through ATM networks. If connected directly, then packets are arbitrary size; but if connected by ATM switches
then all packets exiting the router are chopped into fixed size cells of 53 bytes.

Network architectures based on the Internet Protocol (IP) technology are designed as a "best effort"
service. This means that if bandwidth is available, the data gets through. If, on the other hand, bandwidth is not
available then the data is dropped. This works well with most computer data applications such as file transfers or
remote terminal access. This does not work well with applications that cannot retransmit, or where retransmission is
of no value, such as with video and voice. Getting a video frame out of order makes no sense, whereas file transfer
applications can tolerate such anomalies. Since the packet size is arbitrary at any point in time, making specific delay
variation commitments between any two frames is almost impossible, as there is no way of predicting what type and
size of traffic is ahead of any other type of traffic. The buffers that must handle the data, moreover, must be able to
receive the maximum data size, meaning that the buffering scheme must be optimized to handle larger data packets
while at the same time not wasting too much memory on smaller packets.

ATM is designed to provide several service categories for different applications. These include Constant Bit
Rate (CBR), Available Bit Rate (ABR), Unspecified Bit Rate (UBR) and two versions of Variable Bit Rate (VBR),
real-time and non-real-time. These service categories are defined in terms of Traffic Parameters and QoS Parameters.
Traffic Parameters include Peak Cell Rate (PCR, constant bandwidth), Sustainable Cell Rate (SCR), Maximum Burst Size
(MBS), Minimum Cell Rate (MCR) and Cell Delay Variation Tolerance (CDVT). QoS parameters include Cell Delay
Variation (CDV), Cell Loss Ratio (CLR) and maximum Cell Transfer Delay (maxCTD). As an example, Constant
Bit Rate, CBR (e.g. the service used for voice and video applications), is defined as a service category that allows the
user at call setup time to specify the PCR (peak cell rate, essentially the bandwidth), the CDV, maxCTD and CLR.
The network must then ensure that the values requested by the user and accepted by the network are met; if they are
met, the network is said to be supporting CBR.

The various classes of service direct the network to provide better service for some traffic as opposed to
other types of traffic. In ATM, with fixed length cells, switches manage bandwidth utilization on a line effectively by
controlling the amount of data each traffic flow is allowed to put on a line at any moment in time. They generally
have simpler buffer techniques arising from the fact that there is but one size of data unit. Another advantage is
predictable network delays, especially queuing latencies at each switch. Since all data units are the same size, this
helps to ensure that such traffic QoS parameters as CDV are easily measurable in the network. In non-ATM
networks (i.e. frame-based networks), frames can range anywhere from, say, 40 bytes to thousands of bytes,
rendering it difficult to ensure a consistent CDV (or PDV, Packet Delay Variation), since it is impossible to predict
the delays in the network, lacking consistent transfer times of individual packets.

By carving data into smaller units, ATM can increase the ability of the network to decrease the latency of
transmitting data from one host to another. This also allows for easier queue and buffer management at each hop
through the network. A disadvantage, however, is that a header is added to each cell, making the effective bandwidth
of the network less than if the network had a larger transmission unit. For example, if 1,000 bytes are to be
transferred from one host to another, then a frame-based solution would append a header (approximately 4 bytes)
and transmit the entire frame in less than a second. In ATM, the 1,000 bytes are chopped into 48-byte payloads, each
with a 5-byte header; i.e. 1,000/48 = 20.833 (or 21 cells). Each cell is then given a 5 byte header, increasing the bytes to be
transmitted by 5 * 21 = 105 extra bytes. Thus ATM effectively decreases the available bandwidth to the actual data
by approximately 100 bytes (or about 10%); the decreasing of end-to-end latency also decreases the available
bandwidth for data transmission.
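
The header arithmetic in the example above can be checked with a short sketch. The function name is illustrative only, and the sketch assumes that padding of the final, partly filled cell is not counted as overhead; it counts only the 5-byte headers, as the text does.

```python
import math

CELL_PAYLOAD = 48  # payload bytes carried by one ATM cell
CELL_HEADER = 5    # header bytes added to every ATM cell


def atm_overhead(data_bytes):
    """Return (cells, header_bytes, header_fraction) for data carried in ATM cells."""
    cells = math.ceil(data_bytes / CELL_PAYLOAD)  # last cell may be partly filled
    header_bytes = cells * CELL_HEADER
    return cells, header_bytes, header_bytes / data_bytes


cells, extra, fraction = atm_overhead(1000)
print(cells, extra, round(fraction * 100, 1))  # 21 cells, 105 header bytes, 10.5 percent
```

For the 1,000-byte transfer this gives 21 cells and 105 header bytes, i.e. roughly the 10% figure quoted in the text.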

For some applications, such as video and voice, latency is more important than bandwidth, while for other
applications, such as file transfers, better bandwidth utilization increases performance rather than decreased
hop-by-hop latency.

Recently, the demands for more bandwidth and QoS have grown many fold due to new applications for
multimedia services, including the before described video and voice. This is forcing the growth of ATM networks in
the core of traditional packet-based networks. ATM, because of its fixed packet size, brings reduced processing time
in networks and hence faster forwarding (i.e. lower latency). It also brings with it the ability to take advantage of
traffic classification. Since the cells, as earlier pointed out, are of fixed size, traffic patterns can be controlled through
QoS assignments; i.e. networks can carry traditional packets (in cell format) and constant bandwidth stream data
(e.g. voice/video based data).

As will subsequently be demonstrated, most conventional networking systems inherently are designed for
either forwarding frames or cells but not both. In accordance with the present invention, on the other hand, through
use of novel search algorithms, QoS management and management of packet/cell architecture, both cells and frames
can be transmitted in the same device and with significant advantage over the prior techniques, as later more fully
explained.

Objects of Invention

An object of the present invention, accordingly, is to provide a novel system architecture and method,
useful with any technique for processing data packets and/or cells simultaneously with data packets, and without
impacting the performance aspects of cell forwarding characteristics.

A further object is to provide such a novel architecture in which the architected switch can serve as a packet 
switch in one application and as a cell switch in another application, using the same hardware and software. 

Still a further object is to provide such a system wherein improved results are achieved in managing QoS 
characteristics for both cells and data packets simultaneously based on a common cell/data packets algorithm. 

An additional object is to provide a common parsing algorithm for forwarding cells and data packets using 
common and similar techniques. 

Other and further objects will be explained hereinafter, and are more particularly delineated in the appended
claims.

Summary

In summary, from one of its important viewpoints, the invention encompasses in a data networking system
wherein data is received as either ATM cells or arbitrarily-sized multi-protocol frames from a plurality of I/O
modules any of which can be cell or frame interfaces, a method of processing both ATM cells or such frames in a
native mode, i.e. not transforming frames to cells, using common algorithms for forwarding based on control
information contained in the cell or frame and in such a manner as to preserve QoS characteristics necessary for
correct operation of cell forwarding; processing the packet/cell control information in a forwarding engine with
common algorithms not dependent on context-sensitive information contained in the cell or packet, and passing
results including QoS information to an egress queue manager; passing the cell/packet to the egress I/O transmit
facility in such a manner as to provide a minimal cell delay variation (CDV) so as not to impact correct cell
forwarding characteristics; and controlling the transmit facility so as to provide a common bandwidth management
algorithm for both cell and packets and all without impacting the correct operation of either cells or packets.

Drawings

The invention will now be described in connection with the accompanying drawings in which the
before-mentioned Fig. 1 is a diagram illustrating an ATM (Asynchronous Transfer Mode) cell format;

Fig. 2 is a similar diagram of an Internet Protocol (IP) frame format for 32 bit words;

Fig. 3 is a flowchart comparing Time-Division Multiplexing (TDM), ATM and Packet Data frame
forwarding;

Fig. 4 is a block diagram of the switch of the invention with the cell and packet interfaces; 

Fig. 5 is a block diagram of a traditional prior art bus-based switching architecture, and Fig. 6, its memory- 
based switch data flow diagram; 

Fig. 7 is a block diagram of a traditional prior art cross-bar type switching architecture, and Fig. 8, its
cross-bar data flow diagram;

Figs. 9-11 are interface diagrams illustrating, respectively, a cell switch with a native interface card, a packet
interface on cell switch, and an AAL5 packet interface on cell switch, all with a cross-bar or memory switch;

Figs. 12 and 13 are similar diagrams of a packet switch with native packet interface cards and with AAL5
interface, respectively, for NxN memory connection buses;

Fig. 14 is a block diagram of the switch architecture of the present invention, using the word "NeoN" in 
connection with the packet and cell data switch as a trade name of NeoNET LLC, the assignee of the present 
application; 

Figs. 15 and 16 are diagrams respectively of extended parsing function flows for forwarding decisions and
an overview of such functions and Fig. 17 is a diagram of the forwarding elements;

Fig. 18 is a first stage parse graph tree lookup block diagram, and Fig. 19 is a second stage forwarding table
lookup (FLT) diagram;

Figs. 20 and 21 are respective diagrams of parse graph memory on power up and of a simple illustrative IP
multicast packet;

Fig. 22 presents an initialized lookup table, with all entries pointing to unknown route/cell forwarding
information, and Fig. 23 illustrates the lookup table after adding an illustrative IP address (209.6.34.224/32); and

Fig. 24 is a queuing diagram for scheduling system operation.



Further Background To Preferred Embodiments of Invention



Before proceeding to illustrate the preferred architecture of the invention, it is believed useful first to review
the limitations of the prior and of current network systems, which the present invention admirably overcomes.

Current networking solutions are designed either for switching data packets or cells. As before stated, all
types of data networking switches must receive data on an ingress port, make a forwarding decision, transfer data
from the ingress port to the egress port and transmit that data on the appropriate egress port physical interface.
Beyond the basic data forwarding aspects, there are different requirements for cell switching versus frame
forwarding. As before stated, all current technology divides switching elements into three types: bridges, routers and
switches, and in particular, ATM switches. The distinction between bridges and routers is blurred in that both
forward datagrams, and typically most routers also do bridging functions as well; thus the discussion focuses on
datagram switches (i.e. routers) and ATM switches.

It is in order first to investigate the basic architectural requirements for these two types of switching devices
based on current solutions, and then to present the reasons why current solutions do not provide mechanisms to allow
simultaneous transfer of cells and frames without severely impacting the correct operations of either ATM switching
or frame forwarding. The novel solution based on the present invention will then be clear.

Routers typically have a wide variety of physical interfaces: LAN interfaces, such as Ethernet, Token Ring
and FDDI, and wide-area interfaces, such as Frame Relay, X.25, T1 and ATM. A router has methods for receiving
frames from these various interfaces, and each interface has different frame characteristics. For example, an Ethernet
frame may be anywhere from 64 bytes to 1500 bytes, and an FDDI frame can be anywhere from 64 bytes to
4500 (including header and trailer) bytes. The router's I/O module strips the header that is associated with the
physical interface and presents the resulting frame, such as an IP datagram, to the forwarding engine. The forwarding
engine looks at the IP destination address, Fig. 2, and makes an appropriate forwarding decision. The result of a
forwarding decision is to send the datagram to the egress port as determined by the forwarding tables. The egress port
then attaches the appropriate network-dependent header and transmits the frame out the physical interface. Since
different interfaces may have different frame size requirements, a router may be required to "fragment" a frame, i.e.
"chop" the datagram into useable size. For example, a 2000 byte FDDI frame must be fragmented into frames of
1500 bytes or less before being sent out on an Ethernet interface.
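
The fragmentation step can be pictured with the following sketch. It is a simplification under stated assumptions: it splits on payload size only, ignoring the per-fragment headers and the 8-byte offset alignment that real IP fragmentation imposes, and the function name is hypothetical.

```python
def fragment(payload_len, max_len):
    """Split payload_len bytes into chunk sizes, none larger than max_len."""
    if max_len <= 0:
        raise ValueError("max_len must be positive")
    sizes = []
    remaining = payload_len
    while remaining > 0:
        chunk = min(remaining, max_len)  # fill each fragment as far as allowed
        sizes.append(chunk)
        remaining -= chunk
    return sizes


print(fragment(2000, 1500))  # [1500, 500]
```
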

Current router technology offers "best effort" service. This means that there are no guarantees that
datagrams will not be dropped in a router-based network. Furthermore, because routers transfer datagrams of varying
sizes, there are no per datagram delay variation or latency guarantees. Typically a router is characterized by its
ability to transfer datagrams of a certain size. Thus, the capacity of a router may be characterized by its ability to
transfer 64 byte frames in one second or the latency to transfer a 1500 byte frame from an ingress port to an egress
port. This latency is characterized by last bit in, first bit out.

An ATM switch, by comparison, has only one type of interface, i.e. ATM. An ATM switch makes
forwarding decisions by looking at a forwarding table based on VPI/VCI numbers, Fig. 1. The forwarding table is
typically indexed by physical port number, i.e. an incoming cell with a VPI/VCI on ingress port N gets mapped to an
egress port M with a new VPI/VCI pair. The table is managed by software elsewhere in the system. All cells, no
matter what the ATM Adaptation Layer (AALx), have the same structure, so that if ATM switches can forward one
AAL type, they can forward any type.
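
As a rough illustration of the mapping just described, the table can be modelled as a dictionary keyed by ingress port and VPI/VCI. This is a software sketch only; the entries are made-up values and the function name is an assumption, not a description of any particular switch.

```python
# Key: (ingress port, VPI, VCI) -> value: (egress port, new VPI, new VCI).
# All entries are made-up illustrative values.
forwarding_table = {
    (1, 0, 32): (4, 2, 77),
    (2, 5, 100): (1, 0, 33),
}


def forward_cell(ingress_port, vpi, vci):
    """Return (egress_port, new_vpi, new_vci), or None if no connection is set up."""
    return forwarding_table.get((ingress_port, vpi, vci))


print(forward_cell(1, 0, 32))  # (4, 2, 77)
print(forward_cell(3, 0, 32))  # None: the same VPI/VCI on another port is a different connection
```

Because the lookup never inspects the payload, the same table serves cells of any AAL type, which is the point made in the paragraph above.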

In order to switch ATM cells, several fundamental criteria must be met. The switch must be able to make 
forwarding decisions based on control information provided in the ATM header, specifically VPI/VCI. The switch 
must provide appropriate QoS functions. The switch must provide for specific service types, in particular Constant 
Bit Rate (CBR) traffic and Variable Bit Rate (VBR). CBR (voice or video) traffic is characterized by low latency 
and more importantly low or guaranteed Cell Delay Variation (CDV) and guaranteed bandwidth. 

The three main requirements of implementing CBR type connections over a traditional packet switch are
low CDV, small Delay and guaranteed bandwidth. Voice, for example, consumes a fixed amount of bandwidth,
based on the fundamental Nyquist sampling theorem. CDV is also part of a CBR contract, and plays a role in the
overall Delay. CDV is the total worst case variance between expected arrival time and actual arrival time of a packet/cell.
In so far as an application is concerned, it wants to see data arrive equidistant in time. If, however, the network
cannot guarantee this equidistant requirement, some hardware has to buffer data equal to or more than the worst case
CDV amount introduced by the network. The higher the CDV, the higher is the buffer requirement and hence the
higher Delay; and, as illustrated earlier, Delay is not good for CBR type circuits.
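
The relationship between CDV, buffer size and added Delay can be sketched numerically. The figures below (a 64 kbit/s voice stream and a 2 ms worst-case CDV) are assumptions chosen for illustration, and the function name is hypothetical.

```python
import math


def playout_buffer(rate_bps, worst_case_cdv_us):
    """Return (bytes_to_buffer, added_delay_us) needed to absorb the worst-case CDV.

    The receiver holds back worst_case_cdv_us worth of a constant-rate stream,
    so the buffer grows linearly with CDV and the same time is added as Delay.
    """
    buffer_bytes = math.ceil(rate_bps * worst_case_cdv_us / 8_000_000)
    return buffer_bytes, worst_case_cdv_us


print(playout_buffer(64_000, 2_000))  # (16, 2000)
print(playout_buffer(64_000, 4_000))  # (32, 4000): doubling CDV doubles buffer and delay
```
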

Packet-based networks traditionally queue data at the egress based on priority of traffic. Regardless of how
data is queued, traffic with low delay variation requirement will get queued behind one or more packets. Each of
them could be maximum packet size, and this inherently contributes the most to delay variation on a packet-based
network.



There are many methodologies used to manage bandwidth and priorities. From a Network Management
point of view, a network manager usually likes to carve out the total egress bandwidth into priorities. There are
several reasons for carving this bandwidth: e.g. it ensures the manager that control traffic (Higher Priority and Low
Bandwidth) always has room on the wire even during very high line bandwidth utilization, or perhaps a CBR
(Constant Bit Rate) traffic will be guaranteed on the wire, etc.

There are numerous methods to address bandwidth per traffic priority. Broad classes of these mechanisms
are Round Robin Queuing, Weighted Fair Queuing and Priority Queuing. Each methodology will be explained for
the sake of discussion and completeness of this document. In all cases of queuing, traffic is put into queues based on
priorities, usually by a hardware engine that looks at a cell/packet header or control information associated with the
cell/packet as the cell/packet arrives from the backplane. It is how data is extracted/de-queued from these queues that
differentiates one queuing mechanism from another.

Simple Round Robin Queuing 

This queuing mechanism empties all queues in a round robin fashion. This means that traffic is divided into
queues and each queue gets the same fixed bandwidth. While a clear advantage is simplicity of implementation, a
major disadvantage of this queuing technique is that this mechanism completely loses the concept of priority. Priority
must then be managed by buffer allocation mechanisms. The only clear advantage is simplicity of implementation.

Weighted Round Robin

This queuing mechanism is an enhancement of "Simple Round Robin Queuing", where a weight is placed
on each queue by the network manager during initialization time. In this mechanism each priority queue is serviced based
on the weight. If one queue is allocated 10% of the bandwidth, it will be serviced 10% of the time. Another
queue may have 50% of the allocated bandwidth, and will be serviced 50% of the time. The major drawback is that there
is unused bandwidth on the wire when there is no traffic in a queue of the allocated bandwidth. This results in wasted
bandwidth. There is, moreover, no association of packet size in the dequeuing algorithm, which is crucial for
packet-based switches. Giving equal weight to all packet sizes throws off the bandwidth allocation scheme.
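
A minimal sketch of the weighted scheme, written to exhibit the drawback just described. It assumes a slot-based model in which each queue owns a fixed number of transmit slots per cycle and an owned slot is simply lost when its queue is empty; the names and the weights are illustrative. Like the mechanism in the text, it counts items, not bytes, so packet size plays no part.

```python
from collections import deque


def weighted_round_robin(queues, weights, slots):
    """Serve `slots` transmit slots; queue i owns weights[i] slots per cycle.

    An owned slot whose queue is empty goes unused. Returns (sent, wasted_slots).
    """
    cycle = [i for i, w in enumerate(weights) for _ in range(w)]
    sent, wasted = [], 0
    for slot in range(slots):
        q = queues[cycle[slot % len(cycle)]]
        if q:
            sent.append(q.popleft())
        else:
            wasted += 1  # bandwidth lost even if another queue has data waiting
    return sent, wasted


queues = [deque(["a1", "a2", "a3"]), deque(["b1", "b2"])]
sent, wasted = weighted_round_robin(queues, weights=[1, 5], slots=6)
print(sent, wasted)  # ['a1', 'b1', 'b2'] 3
```

In this run three slots go unused while "a2" and "a3" are still waiting in the first queue.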

Priority Queuing

In this queuing mechanism, output queues are serviced purely based on priority. The Highest Priority
Queue gets serviced first and the Lowest Priority Queue gets serviced last. In this mechanism, Higher Priority
Traffic always preempts the Lower Priority Queue. The drawback of this type of mechanism is that the Lower
Priority Queue may end up with zero bandwidth. The advantage of this mechanism, besides being simple, is that the
bandwidth is not wasted: so long as there is data to send, it will be sent. There is, however, no association of packet
size in the dequeuing algorithm, which is crucial for packet-based switches. Giving equal weight to all packet sizes
throws off the bandwidth allocation scheme, as before noted.
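
Strict priority servicing can be sketched in a few lines; the example is arranged so that the lower priority queue receives no service at all. The function and variable names are illustrative.

```python
from collections import deque


def strict_priority(queues, slots):
    """Serve up to `slots` items, always from the highest-priority non-empty queue.

    queues[0] is the highest priority. Returns the items in transmit order.
    """
    sent = []
    for _ in range(slots):
        for q in queues:
            if q:
                sent.append(q.popleft())
                break  # restart from the highest priority for the next slot
    return sent


high = deque(["h1", "h2", "h3"])
low = deque(["l1", "l2"])
print(strict_priority([high, low], slots=3))  # ['h1', 'h2', 'h3']
print(list(low))                              # ['l1', 'l2'] still waiting
```
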

From the above examples, there is a need to strike a balance between Priority Queuing and Weighted
Round Robin Queuing, along with packet size. This calls for a solution provided by the present invention where
high priority traffic is serviced before lower priority traffic, but each queue is serviced at least within its bandwidth
allocation. In addition to the above requirements, the output buffer should be filled with data from a queue even when
the bandwidth of that queue is exhausted, including with other bandwidth eligible queue data. This technique
enforces bandwidth per traffic queue requirement and also does not waste bandwidth on the wire and is embodied in
the invention.
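
One possible reading of these requirements is sketched below. It is an illustrative assumption, not the scheduling method of the invention: the two-pass structure, the per-interval byte budgets and all names are hypothetical. The first pass serves queues in priority order within each queue's byte allocation; the second pass hands any capacity still free to whatever data is waiting.

```python
from collections import deque


def schedule_interval(queues, alloc_bytes, link_bytes):
    """One scheduling interval; queues[0] is the highest priority.

    queues: list of deques of packet sizes in bytes.
    alloc_bytes: per-queue byte allocation for this interval.
    link_bytes: total bytes the link can carry in this interval.
    Returns a list of (queue_index, packet_size) in transmit order.
    """
    sent, budget = [], link_bytes
    # Pass 1: priority order, each queue limited to its own allocation.
    for i, q in enumerate(queues):
        credit = alloc_bytes[i]
        while q and q[0] <= credit and q[0] <= budget:
            size = q.popleft()
            credit -= size
            budget -= size
            sent.append((i, size))
    # Pass 2: leftover link capacity goes to any waiting data, priority order.
    for i, q in enumerate(queues):
        while q and q[0] <= budget:
            size = q.popleft()
            budget -= size
            sent.append((i, size))
    return sent


queues = [deque([500, 500, 500]), deque([1000])]
print(schedule_interval(queues, alloc_bytes=[1000, 1000], link_bytes=3000))
```

Because the budgets are counted in bytes, packet size enters the dequeuing decision, the lower priority queue still receives its allocation, and the third 500-byte packet of the high priority queue uses capacity that would otherwise be wasted.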

Architectural Issues in Switch Design

Current switching solutions employ two distinct solutions: 1) memory and 2) cross-bar. These solutions are
illustrated in Figs. 5 and 6, showing a traditional bus-based and memory-based architecture, and in Fig. 7, showing a
traditional cross-bar switching architecture.

In the traditional memory-based solutions represented by Fig. 5, data must first be placed inside of main
memory. Since several different I/O modules must transfer data to common memory, contention for this resource
occurs. Main memory provides both a buffering mechanism and a transfer mechanism for data from one physical
port to another physical port. The rate of transfer is then highly dependent on the speed of the egress port and the
ability of the system to move data in and out of main memory and the number of interfaces that must access main
memory.

As more fully shown in Fig. 6, the CPU interfaces through a common bus, with memory access, with a
plurality of data-receiving and transmitting I/O ports #1, #2, etc., with the various dotted and dashed lines showing
the interfacing paths and the shared memory, as is well known. As pointed out previously, the various accesses of
the shared memory result in substantial contention, increasing the latency and unpredictability, which is already
substantial in this kind of architecture because the processing of the control information cannot begin until the entire
packet/cell is received.

Furthermore, as the accesses to the shared memory are increased, so is the contention; and as the
contention is increased, this results in increasing the latency of the system. In the traditional memory-based switch
data flow diagram of Fig. 6, thus, where the access time per read or write to the memory is equal to M, and the
number of bits for a memory access is W, the following functions occur:

There is the write of data from the receive port #1 to shared memory. The time to transfer a packet or cell is
equal to ((B*8)/W)*M, where B is equal to the number of bytes for the packet or cell, M is the access time per read
or write to the memory and W is the number of bits for a memory access. As the packet gets larger so does the time
to write it to memory.

This means that if a packet is destined to an ATM interface as in Fig. 5, followed by a cell, the cell is
delayed by the amount of transfer time from main memory, and in the worst case this could be N packets (where N is
the number of packet, non-ATM interfaces) including the contention among other reads and writes on the bus. If, for
example, B=4000 bytes and M is 80 nanoseconds (for a 64 bit-wide bus for DRAM access), then ((4000*8)/64)*80
= 40,000 nanoseconds for a packet transfer queued before a cell can be sent, and OC 48 is 170 nanoseconds per
64 byte cells. This is only if there is no contention on the bus whatsoever. In the worst case, if a switch has 16 ports
and all the ports are contending simultaneously, then to transfer the same packet would require 640,000 nanoseconds
just to get into the memory, and the same amount to get out, a total time of about 1.3 milliseconds. This occurs if,
between each write into memory, another port has to write to memory as well. So for n=16 ports, n-1, or 15 ports
have to gain access to memory. This means that 15 ports * 80 nanoseconds = 1200 nanoseconds are used by the
system before the next transfer into memory of the original port can occur. Since there are (4000 bytes * 8
bits/byte)/64 bits = 500 accesses, each access is separated by 1200 nanoseconds, and the full transfer takes 500 *
1200 = 600,000 nanoseconds. So the total is system time plus actual transfer time which is 600,000 nanoseconds +
40,000 nanoseconds = 640,000 nanoseconds for the transfer into memory, and another 640,000 nanoseconds out of
memory. This calculation, moreover, does not include any CPU contention issues or delay because of egress port
busy, which would make this calculation even larger.
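
The figures in this worked example can be reproduced with a short sketch of the formula ((B*8)/W)*M and of the worst-case contention model described above, in which each of the other n-1 ports takes one memory access between successive accesses of the transferring port. The function names are illustrative.

```python
import math


def transfer_ns(packet_bytes, bus_bits, access_ns):
    """Uncontended time ((B*8)/W)*M to move a packet across the memory bus."""
    accesses = math.ceil(packet_bytes * 8 / bus_bits)
    return accesses * access_ns


def contended_transfer_ns(packet_bytes, bus_bits, access_ns, ports):
    """Worst case: the other ports-1 ports each take one access between ours."""
    accesses = math.ceil(packet_bytes * 8 / bus_bits)
    wait_ns = accesses * (ports - 1) * access_ns  # time spent waiting on other ports
    return wait_ns + accesses * access_ns


print(transfer_ns(4000, 64, 80))                    # 40000
print(contended_transfer_ns(4000, 64, 80, 16))      # 640000 into memory
print(2 * contended_transfer_ns(4000, 64, 80, 16))  # 1280000, about 1.3 milliseconds in and out
```
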

There are similar disadvantages in traditional cross-bar based solutions as shown in Fig. 7, before
referenced, where there is no main memory, and buffering of data occurs both at the ingress port and egress port. In
the memory-based design of Figs. 5 and 6, buffer memory is shared across all ports, making for very efficient
utilization of memory on the switch. In the cross-bar approach of Fig. 7, each port must provide a large amount of
memory, so that the overall memory of the system is large as there is no common sharing of buffers. The cross-bar
switch is only a conduit for the transfer of data from one physical port on the system to another physical port on the
system. If two ports are simultaneously to transfer data to one output port, one of the two input ports must buffer the
data, thereby increasing the latency and unpredictability as the data from the first input port is transferred to the
output port. The advantage of a cross-bar switch over a memory-based switch, however, is the high rate of data
transfer from one point to another without the inherent limitation of main memory contention on the memory-based
switch.



In the traditional cross-bar switching architecture system of Fig. 7, the CPU interfaces through a common
bus, with memory access, to an interface with the various dotted and dashed lines of Fig. 8 showing the interfacing
paths and the shared memory, as is well known. The CPU makes a forwarding decision based on information in the
data. The data must then be transmitted across the cross-bar switch fabric to the egress port. But if other traffic is
being forwarded to that egress interface, then the data must be buffered in the ingress interface for so long as the
amount of time it takes to transfer the entire cell/packet to the egress memory. There is:

A. Write of data from the receive port #1 to local memory. The time to transfer a packet or cell is equal
to ((B*8)/W)*M, where B is equal to the number of bytes for the packet or cell, M is the access time
per read or write to the memory and W is the number of bits for a memory access. As the packet gets
larger so does the time to write it to memory.

B. Write of data from the receive port #1 to local memory of egress port #2. The time to transfer a
packet or cell is equal to ((B*8)/W)*M + T, where B is equal to the number of bytes for the packet or
cell, M is the access time per read or write to the memory, W is the number of bits for a memory
access and T is the transfer time of the cross-bar switch. As the packet gets larger, so does the time to
transfer it across the cross-bar switch and write it to local memory.

For a packet transfer followed by a cell transfer to an egress port, the calculation is the same as for the
memory-based solution of Figs. 5 and 6. The packet must be transferred to local memory at the same speeds as for
the memory-based solution. The advantage that there is no contention for central memory does not alleviate the
problem that a packet transfer in front of a cell transfer can cause delays that prevent the proper functioning of very
fast interface speeds.
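
To give a feel for the size of this delay, the sketch below applies the cross-bar formula ((B*8)/W)*M + T to the 4000-byte packet used earlier and expresses the result in cell times at the 170 nanosecond OC 48 figure quoted above. The value of T is not given in the text, so T = 0 is assumed here purely for illustration, and the function names are hypothetical.

```python
import math


def crossbar_transfer_ns(num_bytes, bus_bits, access_ns, fabric_ns):
    """((B*8)/W)*M + T: local-memory write time plus cross-bar transit time T."""
    return math.ceil(num_bytes * 8 / bus_bits) * access_ns + fabric_ns


def cell_times_lost(packet_bytes, bus_bits, access_ns, fabric_ns, cell_time_ns):
    """Cell slots that pass on the egress line while a cell waits behind one packet."""
    return crossbar_transfer_ns(packet_bytes, bus_bits, access_ns, fabric_ns) / cell_time_ns


# fabric_ns=0 is an assumption; any real T only makes the wait longer.
print(round(cell_times_lost(4000, 64, 80, 0, 170)))  # about 235 cell times
```
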

The goal is to create a switching device running at high speeds (i.e. SONET defined rates) that provides the
required QoS. The device should be scalable in terms of speed and ports, and the device should allow for equal-time
transfer of cells and frames from an ingress port to an egress port.

While current designs have started to come up with very high speed routers, they have not, however, been
able to provide all the ATM service requirements, thus still maintaining a polarized set of networking devices, i.e.
routers and ATM switches. An optimal solution is one that achieves very high speeds, that provides the required
QoS support, and that has interfaces that merge ATM and packet-based technologies on the same interface, Fig. 3. This
will allow the current investment in either networking technology to be preserved, yet satisfy bandwidth and QoS
demands.



The issues in merging interfaces on a data switch port that accepts ATM cells and treats certain ATM cells
as packets and others as ATM flows, accepts only packets on other interfaces and only cells on yet another set of
interfaces, are shown in later-discussed Fig. 4. These issues are threefold: a) forwarding decision at the ingress
interface for packets and cells, b) switching packets and cells through the switch fabric, and c) managing egress
bandwidth on packets and cells. The present invention, based on the technique of the previously cited co-pending
applications, explains how to create a general data switch that merges the two technologies (i.e. ATM switching and
packet switching) and solves the three issues listed above.

Interface Issues Switch Designs 

The purpose of this section is to compare and contrast ATM and packet-based switch designs and various
interfaces on either type of switch design. Specifically, it identifies problems with both devices as they pertain to forwarding
packets or cells, i.e. issues with ATM switches forwarding packets, and issues with packet switches forwarding cells, Fig. 3.

Typical Design of an ATM Switch 

As previously explained, defined within the ATM standard there are multiple ATM Adaptation Layers (AAL1-AAL5),
each one specifying a different type of service from a wide spectrum of services, namely Constant Bit Rate (CBR)
to Unspecified Bit Rate (UBR). The Constant Bit Rate (AAL1) contract guarantees minimal cell loss with low CDV, while
the Unspecified Bit Rate contract specifies no traffic parameters and no quality of service guarantees. For the purposes of this
invention it is convenient to limit the discussion to AAL1 (CBR) and AAL5 (Fragmented Packets).

Fig. 9 illustrates cell switching with native cell interface cards, showing different modules of a generic ATM
Switch with native ATM interfaces. The cells arriving from the physical layer module (PHY) are processed by a module
called the Policing Function Module, which validates per VCI established contracts (services) for incoming cells, e.g. Peak
Cell Rate, Sustained Cell Rate, Maximum Burst Rate. Other parameters such as Cell Delay Variation (CDV) and Cell Loss
Rate (CLR) are guarantees provided by the box based on the actual design of the cards and the switch. The contracts are set
by the network manager or via ATM signaling mechanisms. Cell data from the policing function then goes, in the
example of Fig. 9, to a cross-bar type (Fig. 7) or memory-based switch (Fig. 5). Cells are then forwarded to the egress port,
which has some requirements of shaping traffic to avoid congestion on the remote connection. To provide egress shaping,
the design will have to buffer data on the egress side. Since ATM connections are based on a point-to-point basis, the
Egress shaper module also has to translate the ATM header. This is because the next hop has no relationship to the ingress
VCI/VPI.

Native Packet Interface on ATM Switch 

As mentioned in the "Background" section, if an ATM switch is to provide a method that facilitates the routing of
packets, there have to be at least two points between two hosts where packet and cell networks meet. This means that
current cell switching equipment has to carry interfaces that have native packet interfaces, unless the switch is sitting deep
in the core of the ATM network. It is now in order, therefore, to examine the design of such a packet interface that connects
to the ATM switch.

A typical packet interface on an ATM Switch is shown in Fig. 10, elaborating on the packet interface on the cell
switch. The physical interface would put incoming packets into a buffer and then they are fed to the "Header Lookup and
Forwarding Engine". The packet-based forwarding engine decides the egress port and associates a VCI number for cells of
that packet. The packet then gets segmented into cells by the Segmentation Unit. From there on, the packet is treated just as
in the native cell switching case, which involves going through a policing function and to the Switch Buffer before entering
the switch. On the egress side, if the cells enter a cell interface, then the processing is just as explained above (in the native
cell interface on ATM switch). If the cells enter a packet interface, then the cells have to be reassembled into packets. These
packets are then put into various priority queues and then emptied as in the packet switch.

Two types of packet interfaces on the ATM Switch should be examined.

AAL5 Interface on ATM Switch

A Router connected to an ATM Switch could segment packets before sending the packet to the ATM Switch. In that
case, packets would arrive at the ATM Switch in AAL5 format, before described. If the ATM Switch were to act as a
Router and an ATM Switch, it would have to reassemble the AAL5 packet and perform a routing decision on it. Once the
ATM Switch/Router makes the forwarding decision on the AAL5 packet, it would then push it through the ATM Switch
after segmenting it again.

An AAL5 packet interface on an ATM Switch is shown in Fig. 11. Incoming AAL5 cells are first policed on a per
VCI basis to ensure that the sender is honoring the contract. Once the policing function is done, an Assembler will
assemble the cells of a VCI into packets. These packets are then forwarded to the forwarding engine, which makes the
forwarding decision based on the assembled packet and some routing algorithm. The packet then travels the ATM Switch as
mentioned in the Packet Interface on ATM Switch section, above.

Difficulties in Processing Packets on Cell Switch

Keeping the goal of the present invention in mind, i.e. to achieve strict QoS parameters such as CDV, latency
and packet loss, this section will list the difficulties of attempting to design for packets through a traditional cell switch.

According to Fig. 11, once the incoming AAL5 segmented packets are assembled and a forwarding decision is
made, they are re-segmented in the "Segmentation Unit". Across the switch, the AAL5 cells are then reassembled into
packets before they are shipped on the egress wire. This segmentation and reassembly adds to the delay and to unpredictable
and unmeasurable PDV (Packet Delay Variation) and cell loss. As earlier mentioned, for packets to be provided QoS, the
switch would need to support a contract that includes providing measurable PDV and delay. Delay is caused by the fact that
the cells have to be reassembled. Each reassembly would have to, in the best case, buffer an entire packet's worth of data
before calling it complete and sending it to the QoS section. For an 8000 byte packet, for example, this could result in a
64 µsec delay in buffering on a 1 Gigabit switch.

The PDV for a packet through a cell switch is even more of a concern than the additional delay. The reassembly
process can be processing multiple packets at the same time from various ingress ports and packets, and this causes an
unpredictable amount of PDV, essentially based on switch contention and the number of retries of sending cells from
ingress to egress.
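
The 64 µsec buffering figure quoted above follows from dividing the packet size in bits by the line rate. A minimal sketch of that arithmetic is given here; the function name is hypothetical and the calculation assumes the full packet must be buffered at line rate before it is released.

```python
def reassembly_buffer_delay_us(packet_bytes, line_rate_bps):
    """Best-case time, in microseconds, to buffer one whole packet at line rate."""
    return (packet_bytes * 8 * 1_000_000) / line_rate_bps


GIGABIT = 1_000_000_000  # 1 Gigabit per second

packet_delay_us = reassembly_buffer_delay_us(8000, GIGABIT)  # 8000-byte packet
cell_delay_us = reassembly_buffer_delay_us(53, GIGABIT)      # single 53-byte cell
```

Under the same assumption a single 53-byte cell needs well under a microsecond, which illustrates why reassembly dominates the added delay for large packets.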

Cell loss through the switch causes packets to get reassembled incorrectly and therefore adversely affects
applications that are real-time content specific. Most file transfer protocols do recover from a dropped packet (due to
dropped cells), but it causes more traffic on the switch due to retransmissions.

In summary, passing packets through an ATM switch does not provide packets with the same CDV and latency
characteristics as cells. It simply provides a mechanism for passing a packet path through a cell switch.

Design of Packet Switch 

A traditional Packet Switch is shown in Fig. 12 with native packet interface cards. Packets are forwarded to the
Forwarding Engine via the physical interface. The Forwarding Engine makes a routing decision based on some algorithm
and the header of the packet. Once the egress port is decided, the packet travels to the egress via the Packet Switch, which
could be designed in one of many ways (e.g. N by N busses, large central memory pool, etc.). On egress, the packets end up
on different traffic priority Queues. These Queues are responsible for prioritizing traffic and bandwidth management.

Cell Interface on Packet Switch 

The traditional packet switch, shown in Fig. 13 with AAL5 interfaces, provides a mechanism to allow cells to pass
through the box so long as the cells are of AAL5 type. There is no practical way of creating a virtual cell switch through a
traditional packet switch, and part of the present invention deals with the requirements of such an architecture.

After AAL5 cells are policed for contract agreements, they are assembled into packets by an Assembly module.
The packets thus created are then processed exactly as on native packet interfaces. On the egress side, if packets have to go
out of the switch as AAL5 cells, they are first segmented and then header translated. Finally they are shaped and sent out.

Difficulties in Processing Cells on Packet Switch: 

There are problems that a cell now faces as it traverses a traditional packet switch. It is extremely difficult for a
traditional data switch, such as a router, to support the QoS guarantees required of ATM. To illustrate the point, reference
is made to the diagram shown in before-described Fig. 13. One of the biggest challenges for a packet switch is to support
AAL1 cells. The simple reason is that the traditional packet-based header lookup and forwarding engines do not
simultaneously recognize cells and packets; therefore only AAL5 cells, which can be converted into packets, are supported.
This is a severe restriction in the capability of the switch.

Among the features of cells are the CDV and the delay characteristics. Pushing cells through a traditional packet
switch adds more delay and an unpredictable CDV. The packet switch, as is inherent in its name, implies that packets of
various sizes and numbers are queued up on the switch. Packetized cells would then have no chance of maintaining any type
of reasonable QoS through the switch.

Preferred Embodiment(s) of the Invention

The present invention, exemplarily illustrated in Figs. 4 and 14, and unlike all these prior systems,
optimizes the networking system for transmitting both cells and frames without internally converting one into the
other. Furthermore, it maintains the strict QoS parameters expected in ATM switches, such as strict CDV, latency
and cell loss. This is achieved by having a common ingress forwarding engine that is context independent, a switch
fabric that transfers cells and frames with similar latency, and a common egress QoS engine, with packets flowing
through the architecture of the invention acquiring cell QoS characteristics while the cells still maintain their QoS
characteristics.

The main components of the novel switch architecture of the invention, sometimes referred to herein by the
acronym for the assignee herein, "NeoN," as shown in Fig. [...], comprise the ingress part, the switch fabric and the
egress part. The ingress part is comprised of differing physical interfaces that may be cell or frame. A cell interface
furthermore may be either pure cell forwarding or a mixture of cell and frame forwarding where a frame is comprised
of a collection of cells as defined in AAL5. Another part of the ingress component is the forwarding engine which is
common to both cells and frames. The switch fabric is common to both cells and frames. The egress QoS is also
common to both cells and frames. The final part of egress processing is the physical layer processing which is
dependent on the type of interface. Thus the NeoN switch architecture of the invention describes those parts that are
common to both cell and frame processing.

The key parameters required for ATM switching, as earlier explained, and that are provided even in the
case of simultaneous packet switching, are predictable CDV, low latency, low cell loss and bandwidth
management, i.e. providing a guaranteed Peak Cell Rate (PCR). The architecture of the invention, Figs. 4 and 14,
however, contains two physical interfaces, AAL5/1 and packet interface, at the ingress and egress. The difference
between the two types of interface is the modules listed as "Per VC Policing Function" and "Per VC Shaping". For
cell interfaces (AAL1-5), the system has to honor contracts set by the network manager as per any ATM switch and
also provide some sort of shaping on a per VCI basis at the egress. Besides those physical interface modules, the
system is identical for a packet or a cell interface. The system is designed with the concept that once the data
traverses the physical interface module, there should be no distinction between a packet and a cell. Fig. 14 lists the
core of the architecture which has three major blocks, namely "Header Lookup and Forwarding Engine", "QoS",
and "Switch" fabric, that handle cells and packets indiscriminately. The discussion, as it relates to this invention, lies
in the design of these three modules which will now be discussed in detail.

Switch Fabric 

The inventions presented in before-referenced co-pending U.S. patent applications Serial No. 581,467 and
Serial No. 900,757, both of common assignee herewith, optimize the networking system for minimal latency, and can
indeed achieve zero latency even as data rates and port densities are increased. They achieve this equally well,
moreover, for either 53 byte cells or 64 byte to 64K byte packets, by extracting the control information from
the packet/cell as it is being written into memory, and providing the control information to a forwarding engine
which will make switching, routing and/or filtering decisions as the data is being written into memory.

Native Cells through the Switch 

The switch cells (AAL1/5) of Fig. 14 are first policed at 2 as per the contract the network manager has
installed on a per VCI basis. This module could also assemble AAL5 cells into packets on selected VCIs. Coming out
of the policing function 2 are either cells or assembled packets. Beyond this juncture of the data flow, there is no
distinction between a packet or a cell until the data reaches the egress port where data has to comply with the
interface requirements. The cells are queued up in the "NeoN Data Switch" 4 and the cell header is examined for
destination interface and QoS requirements. This information is passed on to the egress interface QoS module 6 via a
Control Data Switch, so-labeled at 8. The QoS for a cell-type interface will simply ensure that cell rates beyond the
Peak Cell Rate are clipped. The cells are then forwarded to the "Per VCI Shaping" module 10, where the cells are
forwarded to the physical interface after they are shaped as per the requirements of the next hop switch. Since the
QoS module 6 does not know from the control data whether a packet or a cell is involved, it simply requests the data
from the NeoN Switch into the "Buffer 12". The control data informs the "Per VCI Shaping" block 10 to do either
header translation if it were a cell going into another VCI tunnel, and/or segmentation if the data was a packet going
out on a cell interface, and/or perform shaping as per the remote end requirements.

Native Packets through the NeoN Switch 

As packets enter the interface card, the packet header is examined by a Header Lookup and Forwarding
Engine module 14, while the data is sent to the NeoN data switch 4. The Ingress Forwarding Engine makes a
forwarding decision about the QoS and the destination interface card based on the incoming packet header. The
Forwarding Engine 14 also gathers all information regarding the data packet, like NeoN Switch address, Packet QoS,
Egress Header Translation information, and sends it across to the egress interface card. This information is carried as
a control packet to the egress port through the small non-blocking control data switch 8 to the Egress QoS module 6,
which will queue data as per the control packet and send it to the module listed PHY at the egress. If the packet were
to egress to a cell interface, the packet will be segmented, then header translated and shaped before it leaves the
interface.

Advantages of the NeoN Switch Architecture of the Invention 

As seen above, cells and packets flow through the box without any distinction except at the physical
interfaces, such that if cell characteristics are maintained, then packets have the same characteristics as the cells. The
packets thus have measurable and low PDV (Packet Delay Variation) and low latency, with the architecture
supporting packet switching with cell characteristics and yet interfacing to existing cell interfaces.

While the traditional packet switch is unable to send non-AAL5 cells as before explained, AAL5 cells also
suffer an unpredictable amount of PDV and delay, this being obviated by the NeoN Switch of the invention.
Packets through a traditional ATM Switch also suffer the same long delays and unpredictable CDV; again, not the
case in the NeoN Switch of the invention. The modules that make this type of hybrid switching of the invention
possible include the Ingress Forwarding Engine, the Egress QoS, and the Switch Fabric.

Ingress Forwarding Engine Description

The purpose of the Ingress Forwarding Engine 14, Fig. 14, is to parse the input frame/cell and, based on
predefined criteria and contents of the frame/cell, make a forwarding decision. This means that the input cell/frame is
compared against items stored in memory. If a match is determined, then the contents of the memory location
provide commands for actions on the cell/frame in question. The termination of the search, which is an iterative
process, results in a forwarding decision. A forwarding decision is a determination of how to process the
aforementioned frame/cell. Such processing may include counting statistics, dropping the frame or cell, or sending
the frame or cell to a set of specified egress ports. In Fig. 15, this process is shown at a gross level. An input stream
of four characters is shown (b, c, d, e). The characters have appropriate matching entries in memory, with a character
input producing a pointer to the next character. The final character produces a pointer to a forwarding entry. A
different stream of characters than that illustrated would have a different collection of entries in memory producing
different results.

The proposed Ingress Forwarding Engine 14 is defined to be a Parsing Micro-Engine. The Parsing Micro-Engine
is divided into two parts, an active part and a passive part. The active part is [...]
logic that follows instructions written into the passive memory component, which is composed of two major storage
sections: 1) Parse Graph Tree (PGT), Fig. 18, and 2) Forwarding Lookup Table (FLT), Fig. 19, and a minor storage
section for statistics collection. The Parse Graph Tree is a storage area that contains all the packet header parsing
information, the result of which is an offset in the Forwarding Lookup Table. The FLT contains information about the
destination port, multicast information, egress header manipulation. The design is very flexible, e.g. in a datagram, it
can traverse beyond the DA and SA fields in the packet header and search into the Protocol field and TCP Port
number, etc. The proposed PGT is memory that is divided into 2^n blocks with each block having 2^m elements
(where m < n). Each element can be one of three types - branch element, leaf element, or skip element - and within
each block, there can be any combination of element types.

While particularly useful for the purpose of the present invention, the Parsing Micro-Engine is generic from
the standpoint that it examines an arbitrary collection of bits and makes decisions based on that comparison. This
can be applied, for example, to any text-searching functions, searching for certain arbitrary words. In such
applications, as an illustration, words such as "bomb" or "detonate" in a letter or email may be searched and if a
match is detected, the search engine may then execute predetermined functions such as signaling an alarm. In fact
the same memory can even be used to search for words in different languages.

In the context of the invention, Fig. 14 illustrates having two entry points. One entry point is used to search
for text in one language, while the second entry point is used to search for text in another language. Thus the same
mechanisms and the same hardware are used for two types of searches.

There are two components to the datagram header search, the software component and the hardware
component. The software component creates the elements in the Parse Graph for every new route it finds on an
interface. The software has to create a unique graph starting from a Branch Element and ending on a Leaf Element,
later defined, for each additional new route. The hardware walks the graph from Branch to Leaf Element, clueless
about the IP header.

In fact there can be many entry points in the memory region, as illustrated in Fig. 21. The initial memory
can be divided into multiple regions, each region of memory being a separate series of instructions used for different
applications. In the case of Fig. 22, one of the regions is used for IP forwarding while the other region is used for
ATM forwarding. At system start, the memory is initialized to point to "unknown route", meaning that no forwarding
information is available. When a new entry is inserted, the structure of the Lookup Table changes, as illustrated in
Fig. 23. The illustrative IP address 209.6.34.224 is shown inserted. Since this is a byte-oriented lookup engine, the
first block has a pointer inserted in the 209 location. The pointer points to a block that has a new pointer value in the
6 location, and so on until all of the 209.6.34.224 address is inserted. All other values still point to unknown route.
Inserting the address in the IP portion of memory has no impact on the ATM portion of memory.
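
The byte-oriented insertion described above can be illustrated with a small sketch. It models each block as a 256-entry list initialized to "unknown route" and keeps separate IP and ATM regions. The helper names and the forwarding entry "port 3" are hypothetical, and the nested-list representation is an assumption made for readability, not the memory layout of the specification.

```python
UNKNOWN_ROUTE = "unknown route"
BLOCK_SIZE = 256  # one entry per possible byte value


def new_block():
    """A fresh block in which every location points to the unknown route."""
    return [UNKNOWN_ROUTE] * BLOCK_SIZE


def insert(region, key_bytes, forwarding_entry):
    """Install one pointer per byte, then the forwarding entry at the last byte."""
    block = region
    for b in key_bytes[:-1]:
        if not isinstance(block[b], list):
            block[b] = new_block()
        block = block[b]
    block[key_bytes[-1]] = forwarding_entry


def lookup(region, key_bytes):
    """Walk one block per byte until something other than a pointer is found."""
    entry = region
    for b in key_bytes:
        entry = entry[b]
        if not isinstance(entry, list):
            return entry
    return UNKNOWN_ROUTE


ip_region, atm_region = new_block(), new_block()  # two separate entry points
insert(ip_region, [209, 6, 34, 224], "port 3")

hit = lookup(ip_region, [209, 6, 34, 224])
miss = lookup(ip_region, [209, 6, 34, 225])
```

Only the locations along the 209, 6, 34, 224 path change; every other location, including the whole ATM region, still resolves to the unknown route.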
As mentioned earlier, there are 2^n blocks each with 2^m elements in the parse graph tree. The structure of each element
is as shown in Fig. 17, with each element having the following fields.

1. Instruction Field: In the current design there are three instructions, resulting in a two-bit instruction field. The
instruction description is as follows.

- Branch Element (00): In so far as the Micro Engine is concerned, the branch element essentially points
the Forwarding Engine to the next block address. Also, within the branch element, the user may set
fields in the "Incremental Forwarding Info" field [...]
elements of the final Forwarding Information. For example, if the micro engine was parsing
an IP header, and the branch element was placed at the end of the destination field, then the user could
update the egress port field of the forwarding information. For ATM switching, the user would update the
egress port information at the end of parsing the VPI field.
- Leaf Element (01): This element instructs the end of parsing to the micro engine. The forwarding
information accumulated during the search is then forwarded to the next logical block in the design.
- Skip Element: [...] packet header depends on the number of block addresses the micro engine has to look up. Not every
[...] micro engine would have to keep hopping on non-significant fields of the incoming
datagram and continue the search. The skip size is described below.

2. [...] the protocol, etc.

3. Incremental Forwarding Info Field: [...] During header parsing, forwarding information is [...]
count could be decided based on the VCI. As the parsing is done, therefore, various pieces of the forwarding
information are collected, and when a leaf node is reached, the resulting forwarding information is passed on to
the control path. The width of the incremental forwarding information (hereafter referred to as IFI) should be
equal to the number of mutually exclusive incremental pieces in the forwarding information.

4. Next Block Address field: This field is the next block address to lookup after the current one. The leaf node 
instruction ignores this field. 

5. Statistics Offset Field: In data switches, keeping flow statistics is as crucial as switching the data itself.
Without keeping flow statistics it would be difficult, at best, to manage a switch. Having this statistics offset
field allows one to update statistics at various points of the parse. On an IP Router, for example, one could
collect packet counts on various groups of DA, various groups of SA, all ToS, various protocols etc. In another
example dealing with an ATM switch, this field could allow the user to count cells on individual VPI or VCI or
combinations thereof. If the designer wants to maintain 2^s counters, then the size of this field should be s.

6. FLT Offset Field: This is an offset into the Forwarding Lookup Table, Fig. 18, later discussed in more
detail. The Forwarding Lookup Table has all the mutually exclusive pieces of information that are required to
build the final forwarding information packet.
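
The element fields listed above can be sketched as a small data structure together with a walker that follows branch, skip and leaf elements. This is only an illustration under stated assumptions: the binary code chosen for the skip element, the sparse dictionary representation of blocks, the accumulation of IFI bits by bitwise OR, and the default leaf with FLT offset 0 are all assumptions, not details confirmed by the specification.

```python
from dataclasses import dataclass

BRANCH, LEAF, SKIP = 0b00, 0b01, 0b10  # the SKIP encoding is an assumption


@dataclass
class Element:
    instruction: int
    skip: int = 0          # extra input bytes to jump over (skip elements only)
    ifi: int = 0           # incremental forwarding info bits
    next_block: int = 0    # ignored by leaf elements
    stats_offset: int = 0  # index of the counter to update
    flt_offset: int = 0    # offset into the Forwarding Lookup Table


DEFAULT = Element(LEAF)  # hypothetical default route at FLT offset 0


def parse(blocks, start_block, data, counters):
    """Walk the graph; return (FLT offset, accumulated IFI bits)."""
    block, pos, info = start_block, 0, 0
    while pos < len(data):
        elem = blocks.get(block, {}).get(data[pos], DEFAULT)
        counters[elem.stats_offset] += 1
        info |= elem.ifi
        if elem.instruction == LEAF:
            return elem.flt_offset, info
        pos += 1 + (elem.skip if elem.instruction == SKIP else 0)
        block = elem.next_block
    return DEFAULT.flt_offset, info


# Hypothetical three-block graph: branch on 224, skip one byte after 5, leaf on 7.
blocks = {
    0: {224: Element(BRANCH, next_block=129, ifi=0b01)},
    129: {5: Element(SKIP, skip=1, next_block=131)},
    131: {7: Element(LEAF, flt_offset=5, ifi=0b10, stats_offset=1)},
}
counters = [0] * 4  # 2^s counters with s = 2
result = parse(blocks, 0, bytes([224, 5, 6, 7]), counters)
```

In this sketch the skip element lets the walker ignore the third input byte entirely, so only three lookups are needed for a four-byte key.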

Reference Hardware Design Example 

The following is an example of a hardware reference design for the parser useful with practice of the
present invention. The reference design parser has storage that contains the packet/cell under scrutiny. This storage
element for the cell/frame header information is to be two levels in depth. This creates a two-stage pipeline for
header information into the destination lookup stage of the Ingress Forwarding Engine. This is necessary because the
Ingress Forwarding Engine will not be able to perform a lookup until the entire header information has been stored,
due to the flexible starting point capability. The two-stage pipeline allows the Ingress Forwarding Engine to perform
a lookup on the present header information and also to store the next header information in parallel. When the present
header lookup is completed, then the next header lookup can proceed immediately.

The storage element stores a programmable amount of the incoming bit stream. As an example, the
configuration may be 64 bytes for IP datagrams and 5 bytes for cells. For an interface that handles both cells and
frames, the maximum of these two values may be used.

A DMA Transfer Done signal from each DMA channel will indicate to a state machine that it can begin
snooping and storing header information from the ingress DMA bus. A packet/cell signal will indicate that the
header to be stored is either a packet header or a cell header. When header information [...]
from a DMA channel, a request lookup will be asserted.

For header lookups, there will be a register-based table which will indicate to the [...]
Engine the lookup starting point in the IP [...]
number to index this table. This information [...]
the IP header or fields contained in the data portion of the packet. This capability, along with the skip [...]
filtering cases per interface.

A suitable hardware lookup is shown in Fig. 19 using [...]
either a nibble or a [...] at a time or [...]
VPI/VCI [...]. This capability is programmable by [...] lookup [...]
pointed to by one of sixteen originating nodes, one per interface. The originating nodes are stored in a
programmable register-based table, allowing software to build these trees anywhere in the [...] structure.
A state machine controls the lookup process by examining the status flag bits [...]
clock enables and mux controls to activate.

The result of the Parser lookup is the Forwarding Table lookup, which is a bank of memory yielding the
forwarding result, including the forwarding information [...]
forwarding ID. In order to optimize lookup time
performance, this lookup stage can be pipelined, allowing the first stage to start another lookup in parallel. The
Forwarding ID field will be used in several ways. First, the MSB (Most Significant Bit) of the field is used to
indicate a unicast or multicast packet at the network interface level. For multicast packets, for example, the Egress
Queue Manager will need to look at this bit for queuing of multicast packets to multiple interfaces. For unicast
packets, for example, six bits of the Forwarding ID can indicate the destination interface number and the remaining
16 bits will provide a Layer 2 ID. The Layer 2 ID will be used by the Egress Forwarding logic to determine what
Layer 2 header needs to be prepended to the packet data. For packets, these headers will be added to the packet as it
is moved from the Egress DMA FIFO (first in, first out) to the Egress Buffer Memory. For cells, the Layer 2 ID will
provide the transmit device with the appropriate Channel ID.

For unicast traffic, the Destination I/F number indicates the network destination interface and the Layer 2
ID indicates what type of Layer 2 header needs to be added onto the packet data. For multicast, the multicast ID
indicates both the type of Layer 2 header addition and which network interfaces can transmit the multicast. The
Egress Queue Manager will perform a Multicast ID table lookup to determine on which interfaces the packet will get
transmitted and what kind of Layer 2 header is put back on the packet data.
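
As an illustration of how such a Forwarding ID might be unpacked, the sketch below assumes a 23-bit layout: one flag bit for multicast, then six bits of destination interface and 16 bits of Layer 2 ID for unicast. The exact bit positions, the treatment of the low 22 bits as the multicast ID, and the function names are assumptions made for this example only.

```python
MULTICAST_FLAG = 1 << 22  # assumed position of the most significant bit


def encode_unicast(interface, layer2_id):
    """Pack a 6-bit interface number and a 16-bit Layer 2 ID."""
    return ((interface & 0x3F) << 16) | (layer2_id & 0xFFFF)


def decode_forwarding_id(fid):
    """Split a Forwarding ID into its unicast or multicast parts."""
    if fid & MULTICAST_FLAG:
        return {"kind": "multicast", "multicast_id": fid & (MULTICAST_FLAG - 1)}
    return {
        "kind": "unicast",
        "interface": (fid >> 16) & 0x3F,
        "layer2_id": fid & 0xFFFF,
    }


unicast = decode_forwarding_id(encode_unicast(5, 0x1234))
multicast = decode_forwarding_id(MULTICAST_FLAG | 42)
```

In the unicast case the two fields can be read directly; in the multicast case the remaining bits would serve as the index for the Multicast ID table lookup performed by the Egress Queue Manager.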

An Example of Life of a Packet Under the Forwarding Engine 

It is now in order to explain examples of a simple and a complex packet through the Forwarding Engine of
the invention. On power up, Fig. 19, all 2^n blocks of the parse graph are filled with leaf elements pointing to an FLT
offset that will eventually forward all packets to the Control Processor on the Network Card. This is a default route
for all unrecognized packets. Software is responsible for setting up the default route. The way in which the various
elements are updated into the parse graph memory will be [...]
packet with mask 255.255.0 [...] and a complex filter packet, [...] the
simple IP packet.

Simple Multicast Packet 

On power up, the entire blocks in the Parse Graph Memory may be assumed to be filled with leaf elements
that point to an offset of the FLT which will route the packet to the Network Processor. Let it now be assumed for this
example that the ingress packet has a destination IP address of 224.5.6.7. In this case, the hardware will look up the
224th offset of the first lookup block, also called the originating node, and find a leaf. The hardware will
end the search and look up the default offset in the 224th location and look up the FLT and forward the packet to the
control processor.

When the control processor forwards subsequent packets of destination IP address 224.5.6.7, it will
generate the graph shown in Fig. 21.

The software first has to create the parse graph locally. The parse graph created is listed as 129, 2, 131. The software always looks up the first block, a.k.a. the originating node. The offset in the first block is 224, which is the first byte of the destination IP header. It finds a default route, an indication for software to allocate a new block for all subsequent bytes of the destination IP address. Once the software hits a default route, it knows that this is a link node. From the link node onwards, the software has to allocate new blocks for every byte it wants the hardware to search for a matched destination IP address. Through an appropriate software algorithm, it finds that 129, 2, 131 are the next three available blocks to use. The software will then install a continuation element with BA of 2 in the 5th offset of block 129, a continuation element with BA of 131 in the 6th offset of block 2, and a leaf element with FLT offset 3 at the 7th offset of block 131. Once such a branch with a leaf is created, the node link is then installed. The node has to be installed last in the new leafed branch, so that the hardware never follows a partly built branch. The node in this case is a continuation element with BA of 129 at offset 224 of the 1st block.

The hardware is now ready for any subsequent packets with destination IP address 224.5.6.7, even though it knows nothing about it. Now, when the hardware sees the 224 of the destination IP address, it goes to the 224th offset of the 1st block of the parse graph and finds a continuation element with BA of 129. The hardware will then go to the 5th offset (second byte of destination IP address) of the 129th block and find another continuation element with BA of 2. The hardware will then go to the 6th offset (third byte of destination IP address) of the 2nd block and find another continuation element with BA of 131. The hardware will then go to the 7th offset (fourth byte of destination IP address) of the 131st block and find a leaf element with FLT of 3. The hardware now knows that it has completed the IP match and will forward the forwarding ID in location 3 to the subsequent hardware block, calling the end of packet parsing.

It should be noted that the hardware is simply a slave of the parse graph put in memory by software. The length of the search purely depends on the software requirement of parsing length and memory size. The adverse effects of such parsing are size of memory, and search time, which is directly proportional to the length of the search.

In this case, the search will result in the hardware effecting 4 lookups in the Parse Graph and 1 lookup in the FLT.
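
The install and search steps for 224.5.6.7 can be sketched in a few lines of Python. This is an illustration only, not the hardware or software of the invention: the element classes, the helper names `new_block`, `install` and `lookup`, the 256-entry block size and the default FLT offset of 0 are all assumptions made for the sketch.

```python
from dataclasses import dataclass

DEFAULT_FLT = 0      # assumed FLT offset of the default route (control processor)
BLOCK_SIZE = 256     # assumed: one element per possible byte value


@dataclass(frozen=True)
class Leaf:
    flt: int         # offset into the forwarding lookup table (FLT)


@dataclass(frozen=True)
class Continuation:
    ba: int          # block address (BA) of the next block to search


def new_block():
    """A block as it looks after power up: every offset is a default leaf."""
    return [Leaf(DEFAULT_FLT)] * BLOCK_SIZE


def install(graph, root, key, blocks, flt):
    """Build the new branch first, then write the link node last."""
    assert len(blocks) == len(key) - 1
    for ba in blocks:
        graph[ba] = new_block()
    for i in range(1, len(key)):
        block = graph[blocks[i - 1]]
        if i == len(key) - 1:
            block[key[i]] = Leaf(flt)                 # end of the branch
        else:
            block[key[i]] = Continuation(blocks[i])   # points at the next block
    # Link node: only now can a search reach the new branch.
    graph[root][key[0]] = Continuation(blocks[0])


def lookup(graph, root, key):
    """Follow continuation elements until a leaf; return (FLT offset, lookups)."""
    ba, lookups = root, 0
    for byte in key:
        element = graph[ba][byte]
        lookups += 1
        if isinstance(element, Leaf):
            return element.flt, lookups
        ba = element.ba
    raise ValueError("key exhausted before reaching a leaf")


graph = {1: new_block()}
install(graph, 1, (224, 5, 6, 7), blocks=(129, 2, 131), flt=3)
print(lookup(graph, 1, (224, 5, 6, 7)))
```

With the blocks and offsets of the example, the search for 224.5.6.7 ends at FLT offset 3 after four parse graph lookups, while any address whose first byte still holds the default leaf ends after a single lookup.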

Packet with Mask 255.255.255.0

Building upon the parse graph in Fig. 20, a packet with an illustrative mask 255.255.255.0 and address of 4.6.7.x is now installed. In this case, the software will go to the 4th offset in the originating node and find a continuation element with BA of 129. The software will then go to offset 6 in block 129 and find a default FLT offset. The software then knows that this is a link node. From now on, it has to allocate more blocks in the parse graph, such as block 2. At offset 7 of block 2, it will install a leaf element with FLT 3. Then it will install the link node, consisting of writing a continuation element with BA of 2 at offset 6 of block 129.

When the hardware receives any packet with the header 4.6.7.x, it will look into the 4th offset of the originating node and find a continuation element with BA of 129, then look at the 6th offset in block 129 and find a continuation element with BA of 2, and then look at the leaf element at offset 7 of block 2 with FLT of 3. This FLT will be of value 3, which is then forwarded to the Buffer Manager and eventually the Egress Bandwidth Manager.

Packet with Mask 255.255.0.0

This subsection will build upon the parse graph in Fig. 20 and install a packet with an illustrative mask 255.255.0.0 and address of 4.8.x.y. In this case, the software will go to the 4th offset in the originating node and find a continuation element with BA of 129. The software will then go to offset 8 in block 129 and find a default FLT offset. At this time the software knows that it has to install a new FLT offset (say 4) in the 8th offset of block 129.

When the hardware receives any packet with the header 4.8.x.y, it will look into the 4th offset of the originating node and find a continuation element with BA of 129, then look at the leaf element at the 8th offset of block 129 with FLT of 4, and terminate the search. In this case the hardware will do only 2 lookups.

Complex Filtered Packet

Now assume that there was a requirement to filter a packet with header 4.5.6.8.9.x.y.z.11. There are no restrictions to the above concept of parsing the packet, and the time it takes to parse the packet will increase since the hardware will have to read and compare 9 bytes. The hardware will simply keep parsing, however, until it sees a leaf element. The x.y.z bytes are blocks which contain continuation elements pointing to the next block, with all continuation elements of x pointing to block y, all continuation elements of y pointing to block z, and all continuation elements of z pointing to the block which has entry 11 as a leaf, and the rest being default. This is where the fork element comes into play and may be called up to look up the forwarding at the end of search 4.5.6.8.

Removing Simple IP Multicast Packets

The removal of packets is similar to the reverse of adding an address to the parse graph, explained above. The pseudocode for removal in this embodiment is as follows:

Walk down to end of leaf, remembering each block address and offset in block.

FOR (from Leaf node to originating node)
    IF (only element in block)
        set default FLT offset at the previous NODE offset address
        free the last block
        go to previous block
    ELSE
        set default FLT offset at last leaf
        exit
    ENDIF
ENDFOR
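
The removal walk can be sketched as follows. This is an illustration under stated assumptions rather than the embodiment itself: elements are modelled as `(kind, value)` tuples, the names `DEFAULT`, `make_block` and `remove` are hypothetical, and the sketch unlinks a branch at its highest shared block before freeing the emptied blocks below it.

```python
DEFAULT = ("leaf", 0)    # assumed default element: leaf pointing at FLT offset 0


def make_block():
    """A 256-entry block filled with default leaves."""
    return [DEFAULT] * 256


def remove(graph, root, key):
    """Remove an installed address, freeing blocks that become empty."""
    # Walk down to the leaf, remembering each block address and offset.
    path, ba = [], root
    for byte in key:
        path.append((ba, byte))
        kind, value = graph[ba][byte]
        if kind == "leaf":
            break
        ba = value

    # Walk back from the leaf node toward the originating node.
    to_free = []
    while path:
        ba, offset = path.pop()
        others = any(e != DEFAULT
                     for i, e in enumerate(graph[ba]) if i != offset)
        if others or ba == root:
            # Block is still needed: restore the default here and stop.
            graph[ba][offset] = DEFAULT
            break
        # Only element in this block: the whole block can go.
        to_free.append(ba)

    for ba in to_free:
        del graph[ba]
```

Restoring the default at the surviving block before deleting the emptied ones means a concurrent search never follows a continuation element into a freed block; that ordering is a design choice of this sketch.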

Egress Bandwidth Manager 

Every I/O Module connects a NeoN port to one or multiple physical ports. Each I/O Module supports multiple traffic priorities injected via a single physical NeoN Port. Each traffic priority is assigned some bandwidth by a network manager, as illustrated in Fig. M, being labeled as the "QoS (Packet & Cell)". The purpose of this section is to define how bandwidth is managed on multiple traffic profiles.

NeoN Queuing Concepts

The goal of NeoN Queuing, of the invention, thus, is to be able to associate a fixed configurable bandwidth with every priority queue and also to ensure maximum line utilization. Traditionally, bandwidth enforcement is done in systems by allocating a fixed number of buffers per priority queue. This means that the enqueuing of data on the priority queues enforces bandwidth allocation. When the buffers of a certain queue are filled, then data for that queue is dropped (by not enqueuing data on that queue), this being a rough approximation of the ideal requirement.

There are many real life analogies to understanding the concept of QoS of the present invention, e.g. cars on a highway with multiple entry ramps or moving objects on a multi-channeled conveyor in a manufacturing operation. For our purposes, let us examine the simple case of "cars on a highway". Assume that 8 ramps were to merge into one lane at some point on the highway. In real life experience, everyone knows that this could create traffic jams. But if managed correctly (i.e. with the right QoS), then the single highway lane can be utilised for maximum efficiency. One way to manage this flow is to have no control, and have it be serviced on a first come, first served method. This means that there is no distinction between an ambulance on one ramp and someone headed to the beach on another ramp. But in the methodology of the invention, we define certain preferential characteristics for certain entry ramps. There are different mechanisms that we can create. One is to send one car from each entry ramp in a round robin fashion, i.e. each ramp is equal. This means counting cars. But if one of these "cars" turns out to be a tractor trailer with 3 trailers, then in fact equal service is not being given to all entry ramps as measured by the amount of highway occupied. In fact, if one entry ramp is all tractor trailers, then the backup on the other ramps could be very significant. So it is important to measure the size of the vehicle and its importance. The purpose of the "traffic cop" (aka QoS manager) is to manage which vehicle has the right of way, based on size, importance and perhaps lane number. The "traffic cop" can, in fact, have different instructions every other day on the lane entry characteristics, based on what the "town hall manager", aka network manager, has decided. To conclude the concept of QoS understanding, QoS is a mechanism which allows certain datagrams to pass through queues in a controlled manner, so as to achieve a deterministic and desired goal, which may vary from application to application, e.g. bandwidth utilization, precision bandwidth allocation, low latency, low delay, priority etc.

The NeoN Queuing of the invention handles the problem directly. NeoN Queuing views the buffer allocation as an orthogonal parameter to the queuing and bandwidth issue. NeoN Queuing will literally segment the physical wire into small time units called "Time Slice" (as an example, approximately 200 nanoseconds on OC48 - the time of a 64 byte packet on an OC48). Packets from the back-plane are put into the Priority Queues. Each time a packet is extracted from a queue, a timestamp is also tacked along with that queue. The time stamp indicates distance in time from a "Current Time Counter" in Time Slice units, and when the next packet should be dequeued. The "distance in time" is a function of a) packet size information coming in from the back plane, b) the size of the Time Slice itself and c) the bandwidth allocated for the priority queue. Once a packet is dequeued, another counter is updated which represents the Next Time To Dequeue (NTTD) - such being purely a function of the size of the packet just de-queued. NTTD is one for cell-based cards, because all packets are the same size and fit in one buffer. This really proves that the NeoN Egress Bandwidth Manager is monitoring the line to determine exactly what next to send. This mechanism, therefore, is a bandwidth manager rather than just a dequeuing engine.

The NeoN Queuing of the present invention, moreover, may be viewed as a TDM scheme for allocating bandwidth for different priorities, using priority queuing for ABR (Available Bit Rate) bandwidth. Added advantages of the NeoN Queuing are that, within the TDM mechanism, bandwidth is calculated not on "packet count" but on "packet byte size". This granularity is a much better replica of the actual bandwidth utilization and allows true bandwidth calculations rather than simulated approximations. The second NeoN advantage is that the Network Manager can dynamically change the bandwidth requirements. This is feasible since the bandwidth calculations for priority queues are not at all based on buffer allocation. In NeoN Queuing, rather, the bandwidth allocation is based on time slicing the bandwidth on the physical wire. This type of bandwidth management is absolutely necessary when running at very high line speeds, to keep line utilization high.

Mathematics Used during Queuing

First we will develop the variables and constants being used in the ultimate mathematics.

Symbol    Description
TS        Time Slice of bandwidth on the wire used for calculations (200 nSec for OC48).
NTTS      Next Time To Send. This number, in units of TS, represents an address to de-queue from current time.
BitTime   Time period of a single bit on the wire of the current I/O module.
An        Delay factor in number of TS, representing bandwidth calculations set by the Network Manager, for priority Queue n.
BWn       Bandwidth of Queue n in percentage, as entered or calculated by the CPU software.
Pn        Number of Priority Queues.
TBW       Total Bandwidth of the wire.
NTTD      Next Time To Dequeue.
CT        Current Time in TS units.

Consider first the user interface level to see how bandwidth is allocated amongst various priorities. The user is normally given the job of dividing 100% bandwidth amongst various priorities. The user could also be presented with breaking up the entire bandwidth in bits per second (as an example, for OC48 it would be 2.4 Gbits). In either case, some CPU software calculates a number pair, priority-An, from %-priority or Mbits/sec-priority. Since the CPU is doing this calculation, it can be easily changed based on the I/O module. The Bandwidth Manager does not need to know about the I/O module type, only caring about the priority-An pair. Thus if a user connected to the NeoN port cannot handle data at full line rate, the CPU can change this value to adjust for the customer requirements.

An = 100 / BWn    (1)

Data (in the form of a packet address) from the priority queues is dequeued onto the output fifo. The de-queue engine's calculation of the Next Time To Send for that queue is governed by equation (2). There is one such number for each queue, which gets updated every time a packet is dequeued.

NTTSn = (((Packet Byte Count × BitTime) / TS) × An) + NTTSn-1    (2)

BitTime is a constant that may be fed by the CPU on power-up, depending on the I/O Module. Keeping NTTS to two decimal places would mean that we would have the ability to enforce bandwidth to the 100th of a TS time, as time approaches infinity, but with instant granularity always being TS time.

The Next Time To Dequeue is the time at which we start the dequeue process after the current dequeue. This is primarily based on the current time and the number of buffers in a packet just de-queued.

NTTD = ((Packet Byte Count × BitTime) mod (TS)) + CT    (3)
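
Equations (1) to (3) can be sketched as three small Python functions. The function and parameter names are illustrative. Two assumptions are made: the product of the packet byte count and `bit_time` is taken to be the datagram's time on the wire, and equation (3) is read as the number of time slices the datagram occupies, rounded up, added to CT. That reading is an interpretation chosen for the sketch, not something the text states.

```python
import math


def delay_factor(bw_percent):
    """Equation (1): An = 100 / BWn, with BWn in percent."""
    return 100.0 / bw_percent


def next_time_to_send(prev_ntts, packet_bytes, bit_time, ts, a_n):
    """Equation (2): new NTTSn in TS units, starting from the previous NTTSn."""
    slices_on_wire = (packet_bytes * bit_time) / ts
    return slices_on_wire * a_n + prev_ntts


def next_time_to_dequeue(ct, packet_bytes, bit_time, ts):
    """Equation (3), read here (assumption) as whole time slices occupied."""
    return math.ceil((packet_bytes * bit_time) / ts) + ct


# Illustrative numbers only: a datagram that occupies exactly one time slice
# on a queue holding 10% of the line.
a_n = delay_factor(10.0)
print(next_time_to_send(0.0, 64, 1 / 64, 1.0, a_n))
print(next_time_to_dequeue(5, 64, 1 / 64, 1.0))
```

Under these assumptions a queue with a 10% share is pushed 10 time slices into the future for each slice-sized datagram it sends, while the scheduler itself becomes free again one slice later.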

Queuing Processing

It is now in order to describe the processing needed to queue addresses from the back-plane onto the Priority Queues, Fig. 24, which depicts the overall queuing and scheduling process. Control Data, which includes datagram addresses, from the "NeoN Control Data Switch", is sorted into priority queues by the Queue Engine, based on the QoS information embedded in the Control Data. The Scheduling Engine, whose operation is rendered independent of the Queue Engine, schedules datagram addresses through use of the novel algorithms of the invention listed further below.

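The Queuing Engine has the following tasks:

Enqueue Data: Read the input fifo and queue the buffer address onto the appropriate priority queue.

Watermark Calculations: Calculate when to set back pressure for the ingress, based on the watermarks set for the queue.

Drop Packets: Start dropping packets when the Priority Queues are full.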

For each Priority Queue Pn, there will be a head pointer - pHeadn - and a tail pointer - pTailn. The Input Fifo feeds the priority Queues Pn with buffer addresses from the back-plane. Additionally, there is a forward. For OC48 rates, and assuming 64 byte packets as average size packets, the following processing will be done in about 200 nSecs. The preferred pseudocode of the invention for the En-queue Processor is as follows:

Read Input Fifo.
Find priority of the packet.
IF (room on queue)
    move buffer from Input Fifo to *pTailn priority queue.
    advance pTailn.
    update statistics.
    increment buffer count on queue.
    IF (packet count on queue >= watermark of that queue)
        set back pressure for that priority.
        update statistics.
    ENDIF
ELSE
    move buffer from Input Fifo to drop queue.
    update statistics.
ENDIF

The verbal explanation of the pseudocode listed above is as follows. As each control packet is read from the "NeoN Control Data Switch", it is put onto one of N queues after it is verified that physical space is available on the queue. If there is no room on the queue, the data is put on a drop queue, which allows the hardware to return addresses back to the originating port via the "NeoN Control Data Switch". Also, a watermark is set, per queue, to indicate to the ingress to filter out non-preferred traffic. This algorithm is simple but needs to be executed in one TS.
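
The enqueue step can be sketched in Python as below. The class name, the fixed per-queue capacity, the statistics counters and the use of `collections.deque` in place of head and tail pointers are assumptions of the sketch, not details taken from the embodiment.

```python
from collections import deque


class EnqueueProcessor:
    """Sorts buffer addresses into priority queues, with watermark and drop."""

    def __init__(self, n_queues, capacity, watermark):
        self.queues = [deque() for _ in range(n_queues)]
        self.capacity = capacity            # assumed: same size for every queue
        self.watermark = watermark          # assumed: same watermark for every queue
        self.back_pressure = [False] * n_queues
        self.drop_queue = deque()
        self.stats = {"enqueued": 0, "dropped": 0, "back_pressure": 0}

    def enqueue(self, priority, buffer_addr):
        """Queue one buffer address; return False if it went to the drop queue."""
        queue = self.queues[priority]
        if len(queue) < self.capacity:              # room on queue
            queue.append(buffer_addr)               # move buffer to the tail
            self.stats["enqueued"] += 1
            if len(queue) >= self.watermark:        # watermark reached
                self.back_pressure[priority] = True
                self.stats["back_pressure"] += 1
            return True
        self.drop_queue.append(buffer_addr)         # address goes back to its origin
        self.stats["dropped"] += 1
        return False
```

Keeping the drop queue separate from the priority queues mirrors the text: a dropped address is not discarded but remains available to be returned to the originating port.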

Scheduling Processing

This section will list the algorithm used to de-queue addresses from the Priority Queues Pn onto the output fifo. This calculation also has to be done during one TS.

Wait here till CT == NTTD AND no back pressure from output fifo.    // sync up
X = FALSE                                                           // some variable
FOR (all Pn, High to Low)
    IF (pHeadn != pTailn)
        IF (CT >= NTTSn)
            De-Queue(pHeadn)
            Calculate new NTTSn      // see equation (2) above
            Calculate NTTD           // see equation (3) above
            update statistics
            X = TRUE
            ENDFOR                   // leave the loop: one dequeue per TS
        ENDIF
    ENDIF
ENDFOR
IF (X == FALSE)
    FOR (all Pn, High to Low)
        IF (pHeadn != pTailn)
            De-Queue(pHeadn)
            Calculate new NTTSn      // see equation (2) above
            Calculate NTTD           // see equation (3) above
            update statistics
            X = TRUE
            ENDFOR                   // leave the loop: one dequeue per TS
        ENDIF
    ENDFOR
ENDIF
IF (X == FALSE)
    update statistics
ENDIF
Update CT

The function De-Queue is conceptually a simple routine, listed below:

De-Queue(Qn)
    *pOutputQTail++ = *pHeadn++

The explanation of the pseudocode listed above is that there are two FOR loops in the algorithm, the first FOR loop enforcing the committed bandwidth to the queue, and the second FOR loop serving for bandwidth utilization, sometimes called the aggregate bandwidth FOR Loop.

Examining first the Committed FOR Loop, the queues are checked from the Highest Priority Queue to the Lowest Priority Queue for an available datagram to schedule. If a queue has an available datagram, the algorithm will check to see if the queue's time has come to dequeue, by comparing its NTTSn against CT. If the NTTSn has fallen behind CT, then the queue is dequeued; otherwise, the search goes on to the next queue until all queues are checked. If data from a queue is scheduled to go out, a new NTTSn is calculated for that queue, and a NTTD is always calculated when any queue is de-queued. When a Network Manager assigns weights for the queues, the sum of all weights should not exceed 100%. Since NTTSn is based on datagram size, the output data per queue is a very accurate implementation of the bandwidth set by the manager.

Let us now examine the Aggregate FOR Loop. This loop is only executed when no queue is de-queued during the Committed FOR Loop; in other words, only one dequeue operation is performed in one TS. In this FOR Loop, all queues are checked from Highest Priority to Lowest Priority for available data to dequeue. The algorithm gets into this FOR Loop for one of two reasons: either there was no data in all the queues, or the NTTSn of all queues were still ahead of CT (it was not time to send). If the algorithm entered the Aggregate FOR Loop because of empty queues, then the second time around the fate will be the same. However, if the Aggregate FOR Loop was entered because the NTTSn was not reached for all queues, then the aggregate loop will find the highest priority such queue and de-queue it; also in that case it would update NTTSn and calculate NTTD.
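
The two loops can be sketched as one Python function that represents a single dequeue opportunity. The function name, the `cost` callback (standing in for the equation (2) increment) and the use of `deque` objects are assumptions of the sketch. The wait on NTTD and on output fifo back pressure is left out, so each call stands for one time at which the scheduler is free to send.

```python
from collections import deque


def schedule_once(queues, ntts, ct, cost):
    """Dequeue at most one datagram for the current time ct.

    queues: list of deques, index 0 being the highest priority
    ntts:   per-queue Next Time To Send, in TS units, updated in place
    cost:   cost(n, datagram) -> NTTS increment for queue n (equation (2))
    Returns (queue index, datagram), or None if every queue is empty.
    """
    # Committed loop: serve the first queue whose time to send has come.
    for n, queue in enumerate(queues):
        if queue and ct >= ntts[n]:
            datagram = queue.popleft()
            ntts[n] += cost(n, datagram)
            return n, datagram

    # Aggregate loop: nothing was due, so use the slot for the highest
    # priority queue that has data. Its NTTS still advances (a debit).
    for n, queue in enumerate(queues):
        if queue:
            datagram = queue.popleft()
            ntts[n] += cost(n, datagram)
            return n, datagram

    return None


queues = [deque(["a1", "a2"]), deque(["b1"])]
ntts = [0, 0]
delay = [10, 2]                      # illustrative An values per queue
for ct in range(4):
    print(ct, schedule_once(queues, ntts, ct, lambda n, d: delay[n]))
```

In this run the second datagram of queue 0 goes out at time 2 through the aggregate loop, ahead of its committed time of 10, and the queue's NTTS moves on to 20 as a result.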

The algorithm has built-in credits for queues that do not have data to dequeue in their time slot, and debits for data that is de-queued in the Aggregate Loop. These credits and debits can accumulate over large periods of time. The debit and credit accumulation time is a direct function of the size of the NTTSn field in bits; for example, a 32 bit number would yield about 6 minutes in each direction using 160 nSec as TS (2^32 × 160 nSec). Each individual queue could be configured to lose credits and/or debits, depending on the application in which this algorithm is used. For example, if the algorithm was to be used mainly for CBR type circuits, one would want to clear the debits fairly quickly, whereas for bursty traffic they could be cleared rather slowly. The mechanism for clearing debits/credits is very simple: asynchronously setting NTTSn to CT. If NTTSn is way ahead of CT (the queue has built up a lot of debit), then setting NTTSn to CT would mean losing all the debit. Similarly, if NTTSn had fallen behind CT (the queue has built up a lot of credit), then setting NTTSn to CT would mean losing all the credit.
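
The accumulation time and the clearing mechanism can be checked with a short Python sketch. It assumes that the range of the NTTS field is split evenly between credit and debit; the function names are illustrative.

```python
TS_SECONDS = 160e-9       # time slice used in the example above
NTTS_FIELD_BITS = 32


def accumulation_minutes(field_bits, ts_seconds):
    """Minutes of credit or debit a field of this width can hold, per direction."""
    total_seconds = (2 ** field_bits) * ts_seconds
    return total_seconds / 2 / 60      # assumed: half the range each way


def balance(ntts_n, ct):
    """Positive: debit (queue is ahead of CT). Negative: credit (queue is behind)."""
    return ntts_n - ct


def clear_balance(ntts, n, ct):
    """Drop all credit or debit of queue n by setting NTTSn to CT."""
    ntts[n] = ct


print(round(accumulation_minutes(NTTS_FIELD_BITS, TS_SECONDS), 2))
```

The result is a little under 6 minutes per direction, which agrees with the figure quoted for a 32 bit field.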

Example of Implementing CBR Queue Using the Algorithm 

It is now appropriate to examine how to build a CBR queue out of the algorithm listed above, again referencing Fig. 24. Let it be assumed that the output wire is running at OC48 speeds (2.4 Gbits per second) and that Queue 1 (the highest Priority Queue) has been assigned to be the CBR Queue. The weight on the CBR queue is configured by summing all the input CBR flow bandwidth requirements. For the sake of simplicity, assume there are 100 flows going through the CBR Queue, each with a bandwidth requirement of 2.4 Mbits per second. The CBR Queue bandwidth will then be 2.4 Mbits/sec times 100, i.e. 240 Mbits per second (i.e. 10%). In other words,

QRATEn = Σ Ingress Flow Bandwidth.

An = 100/10 = 10, based on Equation 1.

NTTSn would result in 10 every time a 45 byte datagram is dequeued, based on Equation 2.

NTTSn would result in 20 every time a 90 byte datagram is dequeued, based on Equation 2.

NTTD would result in 1 every time a 45 byte datagram is dequeued, based on Equation 3.

NTTD would result in 2 every time a 90 byte datagram is dequeued, based on Equation 3.
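
The arithmetic of the CBR example can be reproduced with the following Python sketch. The names are illustrative, and the datagram size is expressed directly in time slices, so the sketch takes no position on the value of TS or BitTime.

```python
LINE_RATE_BPS = 2.4e9                 # OC48 rate as rounded in the example
FLOW_RATES_BPS = [2.4e6] * 100        # 100 CBR flows of 2.4 Mbits/sec each

queue_rate_bps = sum(FLOW_RATES_BPS)                  # sum of ingress flow bandwidth
bw_percent = 100 * queue_rate_bps / LINE_RATE_BPS     # share of the line
a_n = 100 / bw_percent                                # equation (1)


def ntts_increment(slices_on_wire, a_n):
    """Equation (2) increment for a datagram occupying this many time slices."""
    return slices_on_wire * a_n


print(bw_percent, a_n, ntts_increment(1, a_n), ntts_increment(2, a_n))
```

A datagram that occupies one time slice advances the queue's NTTS by 10, and one twice as long advances it by 20, matching the pattern of the figures listed above.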

This shows that the queue will be de-queued in a very timely manner, based on datagram size and the % of bandwidth allocated to the queue. This algorithm is independent of wire speed, making it very scalable, and can achieve very high data speeds. This algorithm also takes datagram size into account during scheduling, regardless of the datagram being a cell or a packet. So long as the Network Manager sets the weight of the queue as the sum of all ingress CBR flow bandwidth, the algorithm provides the scheduling very accurately.

Example of Implementing UBR Queue Using the Algorithm

It is very simple to implement a UBR queue using this algorithm, UBR standing for the queue which uses the left over bandwidth on the wire. To implement this type of queue, one of the N queues is configured with 0% bandwidth, and this queue is then dequeued only when there is literally no other queue to de-queue. The NTTS will be set so far in the future that, after the algorithm de-queues one datagram, the next one is never scheduled.

QoS Conclusion 

As has been demonstrated, the algorithm of the invention is very precise in delivering bandwidth, and its granularity is based on the size of TS, being independent of cell/packet information. It also provides all of the ATM services required, implying not only that packets also enjoy the ATM services but that cells and packets coexist on the same interface.

Real Life Network Manager Examples 

This section will now consider different Network Management bandwidth management scenarios, all well handled by the invention. Insofar as the NeoN Network controller is concerned, there are n egress queues (as an example, it could be 8), each queue being assigned a bandwidth. The Egress Bandwidth Manager will deliver that percentage very precisely. The Network Manager can also decide not to assign 100% of the bandwidth to all queues, in which case the left over bandwidth will simply be distributed on a high to low priority basis. Besides these two levels of control, the Network Manager can also examine statistics per priority and make strategic statistical decisions on its own and change percentage allocations.

Exemplary Case 1: Fixed Bandwidth

In this scenario, 100% of the bandwidth is divided into all queues. If all queues are full at all times, then the queues will behave exactly like Fair Weighted Queuing. The reason for this is that the Egress Bandwidth Manager will deliver the percentage of the line bandwidth as requested by the Network Manager, and since the queues are never empty, the Egress Bandwidth Manager does not have time to execute the second FOR loop (Aggregate Loop), discussed above.

If the queues are not full all the time, however, then during the time that a queue is empty, some other queue may be serviced ahead of its time without a charge against its bandwidth.

As an example, if the Network Manager decided to allocate 12.5% bandwidth to every one of the eight queues, then the Network Manager has to provide to the Egress Bandwidth Manager:

An-Priority: a list of all An, one for each priority.

Bit Time: based on the I/O Module the Egress Bandwidth Manager is running on.

For a bandwidth of 12.5%, An would calculate to be 8.00 (100/12.5). For an OC48, Bit Time would calculate to be 402 psec.
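
Both figures can be checked with a few lines of Python. The sketch uses the standard OC-48 line rate of 2488.32 Mbit/s; the function names are illustrative.

```python
OC48_LINE_RATE_BPS = 2488.32e6     # standard OC-48 line rate


def delay_factor(bw_percent):
    """Equation (1): An = 100 / BWn."""
    return 100.0 / bw_percent


def bit_time_ps(line_rate_bps):
    """Duration of one bit on the wire, in picoseconds."""
    return 1e12 / line_rate_bps


shares = [12.5] * 8                                  # eight equal queues
print([delay_factor(s) for s in shares])             # one An per priority
print(round(bit_time_ps(OC48_LINE_RATE_BPS)))        # bit time in psec
```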

Exemplary Case 2: Mixed Bandwidth 

In this example, not all of the bandwidth is divided into all of the queues. In fact, the sum of all fixed bandwidth on the queues is not 100% of the bandwidth available. The Egress Bandwidth Manager will deliver the constant bandwidth on the queues up to the allocated amount, and then distribute the remaining bandwidth on a priority basis. This guarantees the fixed allocations while still carrying prioritised traffic. For queues that are not full during the allocated time, that bandwidth will be lost to the aggregate bandwidth.

Exemplary Case 3: No Mixed Bandwidth For All Queues 

In this scenario, 0% is allocated as fixed bandwidth to all queues, and the queues then behave like pure prioritised queuing. The first FOR Loop listed in the Scheduling Processing section will be considered as a NOP.

Exemplary Case 4: Dynamic Bandwidth

In this illustration, the Network Manager may initially come up with No Mixed Bandwidth for all Queues and then, as it starts to build committed bandwidth circuits, it may create fixed bandwidth queues. The sum of the bandwidth requirements of the flows on an ingress port would dictate the size of the constant bandwidth on the egress port. The granularity of the allocatable egress bandwidth is largely dependent on the depth of the floating point calculation. As an example, it may be assumed that two decimal places may suffice. This then implies a 100th of one percent, and would calculate to be 240 kHz for an OC48 line and 62 kHz for an OC12 line.

It should be observed that the above cases are examples only and the application of the algorithm of the invention is not limited to these ones.

Further modifications will occur to those skilled in this art, and such are considered to fall within the spirit and scope of the invention as defined in the appended claims.

Claims

1. A method of simultaneously processing information contained in data cells and data packets received at the ingress of a data networking system, that comprises, applying both the received data cells and data packets to a common data switch; controlling the switch for cell and packet data forwarding indiscriminately, using common network hardware and algorithms for forwarding based on control information contained in the cell or packet and without transforming packets into cells; and controlling with a common bandwidth management algorithm both cell and packet data forwarding without impacting the correct forwarding characteristics of either.

2. A method as claimed in claim 1 wherein the cell and packet control information is processed in a common forwarding engine with common algorithms independent of context-sensitive information contained in the cell or packet.

3. A method as claimed in claim 2 wherein the information from the forwarding engine is passed to a network egress queue manager and thence to a network egress transmit facility, and in a manner such as to provide minimal cell/packet delay variation.

4. A method as claimed in claim 3 wherein quality of service information is included in the information passed from the forwarding engine and managed by the queue manager for both cells and packets simultaneously based upon the common algorithm.

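5. A method as claimed in claim 4 wherein a common parsing algorithm is used for forwarding both cell data and data packets.

6. A method as claimed in claim 4 wherein, as each control packet is read from the switch, it is put into one of a plurality of queues after it is verified that physical space exists on the queue.

7. A method as claimed in claim 6 wherein, if there is no room on the queue, the data is put in a drop queue and returned by the switch to the ingress of the network.

8. A method as claimed in claim 7 wherein a watermark is set for each queue to instruct each ingress to filter out non-preferred data traffic.

9. A method as claimed in claim 6 wherein bandwidth is allocated for different priorities by packet byte size and based upon time slicing the bandwidth.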

10. A method as claimed in claim 9 wherein the network manager dynamically varies the bandwidth requirement.

11. A method of processing information contained in data cells and data packets received at the ingress of a data networking system, that comprises, applying both the received data cells and data packets to a common data forwarding and routing switch; managing both cell and packet data switching in the common switch using common hardware, common quality of service algorithms, and common forwarding algorithms; and controlling the packet switching independently of and without interfering with the cell data switching.

12. A method of processing packets of information from a forwarding switch and queue managing the forwarding of the same, that comprises, as each packet is read from the switch, putting the same into one of a plurality of queues after it is verified that available physical space exists in the queue; placing the packet information in a drop queue should there be no such space and returning the packet information through the switch; setting a watermark for each queue to enable the filtering of non-preferred information traffic; and allocating bandwidth for different priorities by packet byte size and based upon time slicing the bandwidth.

13. A system architecture apparatus for simultaneously processing information contained in data cells and data packets received at the ingress of a data networking system, said apparatus having, in combination, means for applying both the received data cells and data packets from the ingress to a common data switch within the system; means for controlling the switch for cell and packet indiscriminately, for forwarding by a common algorithm based on control information contained in the cell or packet and without transforming packets into cells; and means for controlling with a common bandwidth management algorithm both cell and packet data forwarding without impacting the correct forwarding characteristics of either.

14. Apparatus as claimed in claim 13 wherein the cell and packet control information is processed in a common forwarding engine with common algorithms, independent of context-sensitive information contained in the cell or packet.

15. Apparatus as claimed in claim 14 wherein means is provided for passing the information from the forwarding engine to a network egress queue manager and thence to a network egress transmit facility, and in a manner such as to provide minimal cell/packet delay variation.

16. Apparatus as claimed in claim 15 wherein quality of service information is included in the information passed from the forwarding engine and managed by the queue manager for both cells and packets simultaneously based upon the common algorithm.

17. Apparatus as claimed in claim 16 wherein a common parsing algorithm is used for forwarding both cells and data packets.

18. Apparatus as claimed in claim 16 wherein the queue managing employs processing that operates as each control packet is read from the switch, to put it into one of a plurality of queues after it is verified that physical space exists on the queue.

19. Apparatus as claimed in claim 18 wherein, if there is no room on the queue, the data is put in a drop queue and returned by the switch to the ingress of the network.

20. Apparatus as claimed in claim 19 wherein a watermark is set for each queue to instruct such ingress to filter out non-preferred data traffic.

21. Apparatus as claimed in claim 18 wherein means is provided for allocating bandwidth for different priorities by packet byte size and based upon time slicing the bandwidth.

22. Apparatus as claimed in claim 21 wherein the network manager dynamically varies the bandwidth requirement.

23. Apparatus as claimed in claim 14 wherein the cell data is of ATM fixed size units and the packet data is of arbitrary size.

24. Apparatus as claimed in claim 14 wherein, between the ingress and the switch, a VCI function/assembly is interfaced.

25. Apparatus as claimed in claim 24 wherein said assembly connects not only to the switch but also to a header lookup and forwarding engine for both the cell and packet data, with the engine connecting through a control data switch and a quality of service managing module to a buffer, also inputting from the output of the switch.

26. Apparatus as claimed in claim 25 wherein the buffer feeds a cell data VC shaping circuit that connects with the system egress.